The X-tree: An Index Structure for High-Dimensional Data

Authors

  • Stefan Berchtold
  • Daniel A. Keim
  • Hans-Peter Kriegel
Abstract

In this paper, we propose a new method for indexing large amounts of point and spatial data in high-dimensional space. An analysis shows that index structures such as the R*-tree are not adequate for indexing high-dimensional data sets. The major problem of R-tree-based index structures is the overlap of the bounding boxes in the directory, which increases with growing dimension. To avoid this problem, we introduce a new organization of the directory which uses a split algorithm minimizing overlap and additionally utilizes the concept of supernodes. The basic idea of overlap-minimizing split and supernodes is to keep the directory as hierarchical as possible, and at the same time to avoid splits in the directory that would result in high overlap. Our experiments show that for high-dimensional data, the X-tree outperforms the well-known R*-tree and the TV-tree by up to two orders of magnitude.
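
The trade-off described in the abstract (split a directory node only when the resulting bounding boxes overlap little, otherwise keep one enlarged supernode) can be illustrated with a minimal sketch. This is not the authors' algorithm: the function names, the overlap measure (intersection volume over union volume of two boxes) and the `max_overlap` threshold of 0.2 are assumptions chosen for illustration only.

```python
from typing import List, Tuple

# A box is (lower corner, upper corner); both corners have the same dimension.
Box = Tuple[List[float], List[float]]


def volume(box: Box) -> float:
    """Volume of an axis-aligned box; an empty box has volume 0."""
    lo, hi = box
    v = 1.0
    for a, b in zip(lo, hi):
        v *= max(0.0, b - a)
    return v


def intersection(a: Box, b: Box) -> Box:
    """Axis-aligned intersection; may be empty (upper < lower on some axis)."""
    lo = [max(x, y) for x, y in zip(a[0], b[0])]
    hi = [min(x, y) for x, y in zip(a[1], b[1])]
    return lo, hi


def overlap_fraction(a: Box, b: Box) -> float:
    """Assumed measure: volume covered by both boxes / volume covered by either."""
    inter = volume(intersection(a, b))
    union = volume(a) + volume(b) - inter
    return inter / union if union > 0 else 0.0


def split_or_supernode(a: Box, b: Box, max_overlap: float = 0.2) -> str:
    """Hypothetical decision rule: accept the split only if overlap stays low."""
    return "split" if overlap_fraction(a, b) <= max_overlap else "supernode"


if __name__ == "__main__":
    left = ([0.0, 0.0], [1.0, 1.0])
    right = ([0.5, 0.0], [1.5, 1.0])   # shares half of its area with `left`
    far = ([2.0, 2.0], [3.0, 3.0])     # disjoint from `left`
    print(split_or_supernode(left, right))
    print(split_or_supernode(left, far))
```

In this sketch, a candidate split whose two halves overlap heavily is rejected and the node would instead be kept as a single larger node, which mirrors the idea of avoiding directory splits that would result in high overlap.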

Similar resources

PK-TREE: A SPATIAL INDEX STRUCTURE FOR HIGH DIMENSIONAL POINT DATA

In this chapter we present the PK-tree which is an index structure for high dimensional point data. The proposed indexing structure can be viewed as combining aspects of the PR-quad or K-D tree but where unnecessary nodes are eliminated. The unnecessary nodes are typically the result of skew in the point distribution and we show that by eliminating these nodes the performance of the resulting i...

Eco-Efficiency Evaluation in Two-Stage Network Structure: Case Study: Cement Companies

The cement industry, as a primary trade, plays an important role in the development of a country's organization. This industry in Iran, however, despite profuse benefits such as high-value mines, faces many challenges. Problems such as exploitation of the production call for research in this area. The main purpose of this paper is to examine the Eco-efficiency in Iran's 2...

PK-tree: A Spatial Index Structure for High Dimensional Point Data

In this paper we present the PK-tree which is an index structure for high dimensional point data. The proposed indexing structure can be viewed as combining aspects of the PR-quad or K-D tree but where unnecessary nodes are eliminated. The unnecessary nodes are typically the result of skew in the point distribution and we show that by eliminating these nodes the performance of the resulting ind...

The Hybrid Tree: An Index Structure for High Dimensional Feature Spaces

Feature based similarity search is emerging as an important search paradigm in database systems. The technique used is to map the data items as points into a high dimensional feature space which is indexed using a multidimensional data structure. Similarity search then corresponds to a range search over the data structure. Although several data structures have been proposed for feature indexing...

Publication date: 1996